A Bayesian approach to the follow-up of candidate gravitational wave signals 
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Ground-based gravitational wave laser interferometers (LIGO, GEO-600, Virgo and Tama-300) 
have now reached high sensitivity and duty cycle. We present a Bayesian evidence-based approach to 
the search for gravitational waves, in particular aimed at the follow-up of candidate events generated 
by the analysis pipeline. We introduce and demonstrate an efficient method to compute the evidence 
and odds ratio between different models, and illustrate this approach using the specific case of the 
gravitational wave signal generated during the inspiral phase of binary systems, modelled at the 
leading quadrupole Newtonian order, in synthetic noise. We show that the method is effective in 
detecting signals at the detection threshold and it is robust against (some types of) instrumental 
artefacts. The computational efficiency of this method makes it scalable to the analysis of all the 
triggers generated by the analysis pipelines to search for coalescing binaries in surveys with ground- 
based interferometers, and to a whole variety of signal waveforms, characterised by a larger number 
of parameters. 

PACS numbers: 04.80.Nn, 02.70.Uu, 02.70.Rr 
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I. INTRODUCTION 

Ground-based gravitational wave (GW) laser interfer- 
ometers - LIGO til, Virgo 0, GEO-600 0] and TAMA- 
300 Q ~ have been in operation for a few years, alter- 
nating times of data taking for science analysis at pro- 
gressively greater sensitivity and duty cycle, with com- 
missioning periods to improve the instruments' perfor- 
mance. Recently LIGO has completed its fifth science 
run (S5) which lasted for w 2 years, recording one in- 
tegrated year of data in triple coincidence between the 
two 4-km arm interferometers and the 2-km arm inter- 
ferometer at design sensitivity 0. In addition, GEO-600 
and Virgo were online for extended periods of S5. LIGO 
and Virgo are now being upgraded to the so-called en- 
hanced configuration Q, leading to a new science run 
(S6) in 2009 at a strain sensitivity a factor « 2 better 
than the present one. The much more intrusive upgrade 
to advanced LIGO/ Virgo 0, [1| is expected to lead to 
commissioning of the interferometers in 2014, and will 
increase the strain sensitivity by a factor « 10 across the 
whole frequency band and shift the low-frequency cut-off 
to about 10 Hz. 

The search for GWs has therefore reached a stage in 
which the first direct detection is plausible and there are 
expectations that by the time the instruments operate 
in advanced configuration it will be possible to routinely 
observe a wide variety of sources in this new observational 
window and study them in detail, see 0] and references 
therein. 

One of the most promising classes of sources are binary 
systems of compact objects, which are reviewed in [9[. 
GW laser interferometers are omni-directional detectors 
that continuously monitor the whole sky looking for rare 
and/or weak signals. The general approach to the anal- 
ysis of the data is to employ efficient algorithms able to 
keep up with the data flow and to identify events above 



a given signal-to-noise ratio threshold. The candidate 
events (or "triggers" ) are then followed up with a range 
of techniques to decide whether or not a GW signal is 
present. 

Most of the effort on the data analysis side has so far 
been devoted to the development and implementation of 
techniques for the mass analysis of the data. These algo- 
rithms have been reaching maturity and are be ing app lied 
to an increasingly large volume of data [l0l. [Tll . [l2[ .[T3[[l^: 
however comparatively less experience has been gained 
on follow-up methods, see e.g. Section VIII of Ref [l3| 
(and references therein) and Ref llSll. So far th e appro ach 
to candidate follow-ups is based [lOl. [isl . [l6l . \Va . [isj on 

(i) addressing the probability of false alarm of an event 
against the background, usually quantified by repeating 
the analysis on data from multiple interferometers shifted 
in time by an unphysical amount (so-called "time-slides"), 

(ii) looking for possible correlations between monitor- 
ing channels and the GW channel that could reveal an 
anomalous behaviour of the instrument (or sub-system) 
or anthropogenic causes around the time of the candidate 
and (iii) checking that the signal at different interferome- 
ters shows consistent behaviour. In this paper we propose 
a conceptually and practically very different approach to 
the follow-up of candidates: Bayesian model selection (or 
hypothesis testing) that is based on the evaluation of the 
marginal likelihood or evidence of a specific model and 
on the odds ratios between competing hypotheses. Note 
that in principle the method discussed here could also 
be used to search for signals in the whole data set; how- 
ever the computational costs involved in adopting such 
an approach and the performance of the existing analy- 
sis algorithms do not strongly support an effort in this 
direction at present. 

The key issue that we address in this paper is, given 
a data set and a set of prior information, how we cal- 
culate the probability of a GW signal being present. 



2 



In the formahsm of Bayesian inference this is trans- 
lated into considering two models - Model 1: "there is 
a GW signal and noise", and Model 2: "only noise is 
present" - and computing the probability of each of them. 
Bayesian inference provides a conceptually straightfor- 
ward prescription to evaluate the probabilities, or the 
odds ratio of the two models. In the context of GW 
observations, Bayesian inference is starting to be con- 
sidered [H, llO, Hi], [13, HI] as a powerful approach for 
ground-based observations. The heavy computational 
load involved in this method when using exhaustive inte- 
gration has however limited its use on real data. Recently 
applications of reversible jum ps Markov-Chain Monte- 
Carlo methods (MCMC) [2J| have been considered to 
tackle this issue in the context of searches for binary in- 
spirals (2^ : in this approach Markov-chains arc free to 
move between models, and therefore one can estimate the 
Bayes factor from the relative time spent by the chains 
in each one. This technique is promising, still the com- 
putational burden is quite significant. In this paper we 
consider a different and efficient numerical implementa- 
tion of the direct calculation of odds ratios that makes 
this approach realistically applicable to extensive follow- 
up studies of triggers and that we show is robust against 
a selection of noise artefacts. 

The paper is organised in the following way: in Sec- 
tion [TT] we introduce the key concepts about model selec- 
tion and hypothesis testing in the Bayesian framework; 
we also introduce the simple case possible observations 
of inspiralling binary systems in a single interferometer 
- that we will consider in the paper; Section IIIII con- 
tains the key results of this paper, where we show the 
power and robustness of Bayesian hypothesis testing for 
follow-up studies; Section IIVI contains our conclusions 
and pointers to future work. 



II. MODEL SELECTION 
A. Overview 

Let us consider a set of hypotheses (or models) {H} 
and denote with / all relevant prior information we hold. 
The predictions made by a model depend on a set of 
or more parameters, and the possible combinations of 
parameters define the parameter space of that model. A 
parameter vector represents a point in parameter space, 
and although each model has its own set of parameters, 
for ease of notation we shall use 9 for them all, and we 
represent the data under consideration by d. 

A straightforward application of Bayes theorem yields 
the probability of a given model: 



normalised and satisfy: ^jLi P{Hj\d, I) = 1. In Equa- 
tion [1] P{Hj\d, I) is the posterior probability of model 
Hj given the data, P{Hj\I) is the prior probability of hy- 
pothesis Hj and P{d\Hj , I) is the marginal likelihood or 
evidence for Hj that can be written as: 

P{d\H,J) = C{H,) 

d9p{9\H„I)p{d\lH,,I), (2) 



P{Hj\dJ) 



P{H,\I)P{d\H,J) 
P{d\I) 



(1) 



If the models under consideration are exhaustive and mu- 
tually exclusive, then the probabilities above are clearly 



where p{9\Hj,I) is the prior probability density distri- 
bution for the parameter vector 9 and is normalised to 
1. p{d\9, Hj, I) is called the likelihood, and is an un- 
normalised measure of the fit of the data to the model. 

If we had an exhaustive set of models, we could sim- 
ply calculate the probability of each model and compare 
them to find the most likely. Unfortunately we do not 
have a complete set of models of the data from a GW 
detector, but we still want to compare the models we do 
have. This is a normal procedure in Bayesian inference, 
which gives what is termed the odds ratio between hy- 
potheses, which is a quantification of their relative prob- 
ability. 

For example, the odds of model Hj against model Hk 



0,k = 



P{Hj\d,I) _ P{Hj\I) P{d\Hj,I) 



P{Hk\d,I) 
P{H,\I) 
P{Hk\I) 



P{Hu\I) P{d\Hk,I) 



B 



jk 



(3) 



where P{Hj\I)/ P{Hk\I) encodes the ratio of the 
prior state of belief of the two models and Bjk = 
P{d\Hj , I) / P{d\Hk, I) is known as the Bayes factor; the 
marginal likelihoods P{d\Hj k, I) are given by Eq. 
The practical advantage of considering the odds ratio ([3]) 
is the fact that one does not need to evaluate P{d\I). Of 
course, if one wanted to evaluate the odds of model Hj 
with respect to all the other independent alternatives, 
the appropriate quantity is: 



O 



J, Other 



PiH,\d,I) 



(4) 



An interesting consequence of using Bayesian model se- 
lection is that it automatically includes a quantitative 
version of "Occam's Razor", the principle that the sim- 
pler model should be prefered. Because the evidence is 
found by integrating over the entire prior domain of a 
model, and this prior is normalised to unity, the larger 
the volume of parameter space which the model spreads 
its prior over the lower the resulting evidence will be, 
all else being equal. If two models can assign the same 
maximum likelihood to some data by fitting it equally 
well, the one which makes the more precise prediction of 
the parameters which produce the maximum likelihood 
will benefit the more than one which makes the broader 
prediction, [sol. 
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B. A concrete example 



where 



The specific problem that we are addressing here is 
how to answer the question: what are the odds that there 
is a signal present in the given observations? We actually 
need to be more specific and spell out the whole set of 
background information / for the problem at hand. Here 
we concentrate on searches for inspiralling compact ob- 
jects, though the method outlined here has the potential 
of much wider applications. The background information 
/ is therefore as follows: (i) the data set d consist of the 
superposition of noise n and (possibly) an inspiral GW 
signal h; (ii) if present, only one GW signal is present at 
any one time; (iii) the waveform model is exactly known, 
though the actual parameters characterising the source 
are unknown, within some prior range; (iv) gravitational 
radiation and noise are statistically independent; (v) the 
noise is a Gaussian and stationary random process with 
zero mean and variance (at any frequency) described by a 
known spectral density; (vi) observations are carried out 
with a single instrument. 

We are therefore considering a situation in which 
(schematically) there are only two models: 

• The first hypothesis - that we will label as Hs, for 
noise and GW signal - corresponds to "there is a 
signal and noise present" ; the data set in the time 
domain is therefore described by 



d{t) = n{t) + h{t). 



(5) 



• The second hypothesis - that we will indicate as 
H^, for noise only - corresponds to "there is only 
noise present" , and the data set is therefore: 



d{t) = n{t). 



(6) 



The goal of the analysis is therefore to compute the odds 
ratio. Equation ([3]), between Hq and that we will 
indicated with Osn- 

In this paper, for simplicity we consider the obser- 
vations of gravitational radiation produced during the 
inspiral phase of a binary system of non-spinning com- 
pact objects. The two objects have individual mass 
TOi^2j and the binary's chirp and total mass are there- 
fore M = {mim2Y {nil + m2)~^^^ and m = mi -\- m2, 
respectively. We have also assumed mi = TO2 . We model 
radiation at the leading Newtonian quadrupole order. As 
the analysis is more conveniently carried out in the fre- 
quency domain, the model that we adopt is the station- 
ary phase approximation to the radiation in the Fourier 
domain, see e.g. [3l|. Describing with g{f) the Fourier 
transform at frequency / of the time-domain function 
g{t) we can express the GW signal as: 







(/ < /lso) 
(/ > /lso) 



(7) 



V(/;e) = 27r/t,-0e 



(8^A1/)- 



-5/3 



(8) 



is the signal phase in the frequency domain and /lso is 
the frequency of the last stable orbit; in this paper we 
set /lso equal to the last stable circular orbit for the 
Schwarzschild space-time and equal mass non-spinning 
objects, so that 



hso = i6'^'2'/'nMr 



(9) 



In Eqs ([7]) and (O, 6* is the 4-dimensional parameter vec- 
tor describing the GW signal; as parameters we choose 



= {A,M,t„(j>,}, 



(10) 



where A is an overall amplitude that depends on the 
source position in the sky and the inclination and po- 
larisation angle of the source, and tc and (pc are the 
time and phase at coalescence. Note that we are us- 
ing geometrical units in which c = G = 1 and therefore 
Mq = 4.926 X 10-6 s. 

We model the noise as a Gaussian and stationary ran- 
dom process with zero mean, and variance at frequency 

fk given hy al = (^j^^ "^l^ich roughly 

represents the shape of a first generation interferometer 
noise curve up to a multiplicative constant of the order 
of 10~**, which is irrelevant as it cancels in the likelihood 
function; /o = 150 Hz is chosen to pick the frequency of 
maximum sensitivity. 

In a real application, the variance of each point can 
be estimated from the one-sided power spectral density, 
calculated with Welch's method (for example) from the 
data around the segment of interest [1^1 ■ Notice that in 
the next section we will consider deviations of the actual 
noise affecting the measurements (but the noise model 
used to construct the likelihood will remain unchanged) 
from the Gaussian and stationary noise shown above. We 
will discuss this in more detail in Section [IIII 



C. Evaluation of the odds ratio 

We can now spell out the odds ratio that needs to be 
calculated; from Equation ([5]), this is given by 



O 



SN 



P{Hs\I) P{d\HsJ) 
PiH^\I) P{d\H^J) 



(11) 



For the noise model in this case, there are no free param- 
eters since the noise profile is known, and the evidence is 
given for Gaussian noise by. 



N 



P{d\H^,I)^Y[{2Trc7l) cxp 



fc=i 



(12) 
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where the index fc is a short-hand representation for those 
quantities dependent on frequency //; , and N is the total 
number of data points. Each point k has a real and 
imaginary observation, the variance of each part will be 
equal in any realistic data, cr^ = (t|^ = (t|j, although in 
principle they could differ. Evaluating the evidence for 
this case requires no integration. 

For the signal model however, the evaluation requires 
integration over the prior domain (denoted 0) of all pa- 
rameters, codified in a vector 9 and is given by 

P{d\Hs,I) = / d9p{9\Hs,I)Y[{27Tal) ' 



X exp 



hk{9)-dk\ 



V 



2-^ 



(13) 



The prior probability density function p{9\Hg, I) is for 
this proof of concept case uniform on all four parameters, 
within the domain defined below. It should be noted that 
the prior must be normalised to one when computing the 
Bayes factor, and that increasing the range of the prior 
will decrease the Bayes factor, as the model becomes less 
predictive and is penalised in automatic accordance with 
Occam's razor. 

In the case of the Newtonian inspiral model, the com- 
putation of the marginal likelihood p3p requires the eval- 
uation of a 4-dimensional integral which could be calcu- 
lated using a grid-based approach. However we are in- 
terested in developing a general and flexible approach, 
and for inspiral binaries we will need to expand the evi- 
dence calculation to include more realistic waveforms for 
the analysis of real data with post-Newtonian effects and 
spins, which have many more parameters: the integral in 
Eq. rapidly becomes unfeasible to integrate exhaus- 
tively. To avoid this problem, we have used a probabilis- 
tic algorithm called Nested Sampling [2^, [2^, [23| , which 
has been used before in the context of Bayesian hypoth- 
esis testing in cosmology [l^, but not considered so far 
in gravitational wave data analysis. The application of 
this technique has required significant development and 
tuning of the algorithm; the details of this work are be- 
yond the scope of this paper and will be documented in 
a separate publication [20|. Here we shall focus on the 
advantages of using the evidence in the context of inspi- 
ral analyses, which in the next section will be presented 
with discussion of results from some fiducial data analysis 
problems. 



III. RESULTS 

In this Section, we will document the results of four 
particular problems of interest on which the algorithm 
has been tested: (i) A test on pure Gaussian noise, (ii) 
the same noise with added signals of different strengths 



introduced in order to explore the sensitivity of the al- 
gorithm to different signal-to- noise ratios. In addition to 
these cases of obvious relevance, we have also considered 
the situation where some noise artefacts - not included 
into the model for the computation of the marginal likeli- 
hood and the Bayes factor - were added to the data (that 
is to the simple Gaussian and stationary noise contribu- 
tion). In particular, (iii) a decaying sinusoid waveform 
which might approximate an instrumental ringdown, and 
(iv) a Poissonian noise component. The two latter cases 
in particular are intended to look at the situation where 
the models to be tested do not fit the data well at all, in 
other words they are not exhaustive. This is of interest 
because in reality there will be more types of interference 
with the data than we could possibly hope to model, so 
robustness against such artefacts is an essential charac- 
teristic of a search algorithm. This is directly related to 
the false-alarm rate of existing searches. 

In this paper, we have used synthetic data sets in the 
frequency band 40 -500 Hz, and used 30 s of data. The 
priors defining the range of the model were such that 
A £ [0,2 X 10^], M G [7.7, 8.3] M0, £ [19.9, 20.1] s, 
4>c S [0, 27r). The large amplitude reflects the conversion 
of mass into seconds in Equation ([7]). The injected value 
M = 8.0 Mq was chosen purely to speed up the analysis, 
as its highest frequency fLSo{M = 8.0 M©) = 478.45Hz 
allowed frequency components above this value to be 
eliminated from the innermost loop of the calculation. 
These prior ranges were chosen to approximate the size of 
the parameter space that would have to be searched in a 
real application, which we suggest would be run on candi- 
dates triggered by a higher stage in a pipeline, providing 
some information about the possible characteristics of the 
signal, namely A4 and tc- On these data sets and with 
such prior ranges, the calculation of the evidence took 
between several minutes and several hours on a single 2 
GHz CPU; the efficiency of the algorithm varies however 
with the specific tuning in a complicated fashion which 
will be described in a future paper [2^ . The speed of the 
computation makes this approach amenable to extension 
to waveforms characterised by a much larger number of 
parameters, and to full lists of triggers generated by an 
inspiral search pipelines, see e.g. Refs. [lOl [Til. [l3| . 



A. Sources of Uncertainties 

Before presenting the results of the example trials, we 
shall emphasise that there are two contributions to the 
distribution of results which are obtained when using this 
method on any data set. These are the inherent uncer- 
tainties from the probabilistic nature of the algorithm 
itself, and the variations in results due to different noise 
realisations, produced by the random nature of the noise. 

The contribution from the noise is an inherent part 
of any search and the results will naturally vary from 
dataset to dataset depending on the particular observa- 
tions made. This cannot be averted, nor should it be. 
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The uncertainty inherent in the algorithm itself however, 
is a true "error" in the result, since there is some par- 
ticular number which represents the odds ratio, and any 
deviation from this is undesired. Fortunately, by increas- 
ing the running time of the algorithm, these errors can be 
minimised to a level similar to, or below the variation due 
to different noise realisations. The inherent uncertainty 
can in principle be made arbitrarily small, but in the in- 
terest of keeping a low run-time, in this paper we have 
chosen a level roughly similar to the changes caused by 
the random fluctuations of the noise; this is appropriate 
for the issues discussed here, and none of our conclusions 
are affected by the approximation in the evaluation of 
the integral in Eq. (fT3|l . 

We have run two simple test cases to illustrate the dif- 
ferent sources of uncertainties that affect the results; in 
both cases we have run the algorithm a number of times 
on signal-free data sets; in the first instance, data sets 
were generated to contain different realisations of Gaus- 
sian and stationary noise; in the second, the evidence 
evaluation was performed multiple times on the same 
noise realisation; Figures [1] and [2] summarise the results. 
Figure [1] shows the distribution of logj^g Bayes factors 
recovered from the analysis of 100 different noise realisa- 
tions, and represents the background level of changes in 
the odds which is caused by random fluctuations in the 
noise. Figure [2] shows the inherent uncertainty caused 
by the algorithm when repeatedly running on a single 
noise realisation of the same characteristics. The distri- 
bution is shown for 200 runs, and the accuracy level has 
been chosen such that the errors from this effect arc not 
greater then those in Figure [1] 



B. Stationary Gaussian noise with no signal 

The most basic test of whether the evidence-based ap- 
proach can be helpful in a follow-up analysis is to find 
how it responds in the absence of a signal. At the present 
time, this is still the routine situation in gravitational 
wave data analysis. The results shown in Figure [2] are 
the values of the Bayes factor, cf. Eqs ^ and pT|) . be- 
tween the signal hypothesis and the noise hypothesis, as 
calculated by performing 200 runs on identical noise re- 
alisations. 

The recovered Bayes factors for the hypotheses favour, 
as expected, the noise only hypothesis, but are (rela- 
tively) close to unity. This number is then multiplied by 
a prior odds ratio, sec Eq. to get a total posterior 

odds ratio. A reasonable prior would give much larger 
credence to the hypothesis that no signal is present (i.e. 
have a value ^ 1), in order to reflect the fact that at 
present sensitivity, we expect inspiral binaries to be rare 
events. The magnitude of this prior ratio is a factor which 
quantifies the scepticism of the analysis toward the sig- 
nal hypothesis, in effect creating a threshold Bayes factor 
above which the observed signal dominates the prior dis- 
belief. This number can also be chosen from a procedural 
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FIG. 1: The distribution of log^Q Bayes factors recovered from 
running the algorithm on 100 different Gaussian and station- 
ary noise realisations, each of length 13 800 complex samples 
as described in the text. The distribution in values arises 
from random fluctuations of the noise, which is contrasted 
with Figure (2] where the inherent uncertainty in the algo- 
rithm is shown. In both cases, the variation in Bayes factors 
found in datasets with no signal is extremely small compared 
to the change in Bayes factor that the presence of a signal 
produces, which is shown in Figure |3] for a range of signal-to- 
noise ratios. 



point of view, so as to obtain a desired "false alarm rate" , 
by performing simulations where known GW signals are 
added to the data either in hardware or in software. 

In the next Section we test the response of the al- 
gorithm to a GW inspiral signal added to noise, using 
signal-to-noise ratios ranging from 1 to 7. From this we 
shall sec that the evidence is an extremely sensitive in- 
dicator of the presence of a signal, indeed it is provably 
the optimal inference for model comparison [30l |. 



C. Signal injected into stationary Gaussian noise 

In this section we discuss the results obtained by 
adding GW inspiral signals, with signal-to- noise ratio 
(SNR) varying between 1 and 7, to stationary and Gaus- 
sian noise, of the same statistical properties as in section 
IIIIBI The three non-amplitude parameters were kept 
the same for all of the injections, and had the values 
M = 8.0 Mq, tc = 20.0s, 0c = 0.0. The algorithm was 
run on each of these datasets 20 times, and the results 
are summarised in Figure [H The optimal signal-to-noise 
ratio [sH is given by 
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FIG. 2: The distribution of log^Q Bayes factors recovered from 
running the algorithm 200 times on the same dataset which 
contained no signal. The noise is modelled as a Gaussian 
and stationary random process. The distribution of results 
is caused by the probabilistic nature of the algorithm, and 
the range of this distribution can be reduced to have an arbi- 
trarily small range at the expense of increased computation 
time. In this example and throughout the rest of the paper, 
the distribution width is chosen to be the same order as the 
uncertainty caused by different noise realisations, shown in 
Figure [T] 



SNR: 



N 

fc=l 



(14) 



It is very clear from Figure [3l that the Bayes factor 
rapidly increases with SNR, hence the logarithmic 
scale on the vertical axis. This shows that the odds ratio 
between the two candidate models in this case climbs 
very rapidly to favour the signal model Hs once sufficient 
evidence is present. 

We can now see the effect that the prior odds ratio 
has on the posterior odds ratio: it acts as a threshold 
value, above which the overall odds are in favour of the 
signal model. The value of the prior odds effectively sets 
a limit on SNR above which the odds favour the signal 
model, but once this threshold has been reached, the in- 
creasing Bayes factor will rapidly climb by many orders 
of magnitude. 

In the cases discussed above, the SNR is chosen by 
changing the overall amplitude of the injection, while 
holding the injected chirp mass constant between each 
case. It should be noted that there is an additional 
change in the odds ratio which comes about when the 
injected mass changes. This is due to the fact that the 
noise evidence P{d\H^, I) is dependent on the specific 



3 4 5 

signal to noise ratio 



FIG. 3: The mean recovered logjQ Bayes factors from adding 
an inspiral signal of increasing signal-to-noise ratio, shown 
on the X-axis, to Gaussian and stationary noise. The algo- 
rithm was run on each dataset (identical noise realisation) 20 
times: each point on the plot corresponds to the mean of the 
Bayes factor. Note that the Bayes factor grows exponentially 
with increasing SNR, such that it is an extremely sensitive 
indication of the presence of a signal. The signal and data 
parameters were chosen as described in the text. 



shape of the waveform that is present in the data and 
its inner product with the individual noise realisation. 
Writing the evidence for the noise model when the data 
is composed of noise and signal, dk = fik + hk, shows the 
origin of the effect: 



P{d\HN,I) oc J|exp 




nlhk + fikhl 



(15) 



The first term in bracket is a measure of the fit of the 
noise to the estimated background noise spectrum {crk} 
independent of the signal. The second term is the signal- 
to-noise ratio squared, which is independent of the noise 
realisation. The third term is the important one in this 
effect: it measures the contrast between the signal wave- 
form and the noise realisation. It is possible to have a 
constant SNR while the Bayes factor changes if this term 
varies, although because the noise and signal are uncorrc- 
lated {{hlhh) — {"fikhl) — 0), it should be small in com- 
parison to the other terms in this equation. This term 
can change through either differing noise realisations, or 
a change in the waveform's shape, the phase, or the over- 
all amplitude of the signal. Since we have used the same 
value of Ai in these simulations, this effect should be 
further reduced. 

The choice of the prior odds is an interesting question 
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which we do not attempt to answer conclusively here, 
but at least two possibilities seem reasonable. Since the 
prior hypothesis probability P{Hs\I) on the signal model 
reflects our knowledge before we examine the dataset, we 
could consider our knowledge of the rate of inspiral events 
which would be visible to the interferometer in question 
(see e.g. [H] and references therein), and the length of 
the prior window on tc. Multiplying these would give us 
an estimate of the number of inspirals we would expect 
to see in that time period and mass range, which could 
be used as the prior odds on seeing an inspiral event in 
that time. 

Alternatively, one could perform a large number of tri- 
als on different data realisations and find the distribution 
of Bayes factors for "false alarm" signals with their fre- 
quency. A desired false alarm rate could then be set 
by choosing the prior odds ratio to be the inverse of the 
Bayes factor for that false alarm rate, see e.g. [3§|. In real 
data where it is not known a-priori if a signal is present, 
were the search algorithm extended to coherent multiple 
interferometer models, this could easily be accomplished 
by offsetting the data from two or more interferometers, 
so that any real signal would not appear with phase co- 
herence. This is similar to the approach taken in the 
existing inspiral analysis pipeline to perform b ackg round 
rate estimation, see e.g. Refs. [H O 113, [13, \lM- This 
choice also has appeal as it would automatically include 
the effect of noise artefacts present in real data which 
might cause the odds ratio to increase even in the case 
where there is no astrophysical signal, partially matching 
the inspiral model. 

A more Baycsian treatment of spurious artefacts might 
attempt to model them too, and then compare their 
model evidences to those for the GW signal and noise 
models to test each hypothesis against the others. How- 
ever, it is unlikely to be possible to model every single 
noise artefact that appears in a real interferometer, so it 
is desirable that the basic algorithm functions well even 
in the presence of such disturbances. In the following 
sections we report results of tests of the performance of 
the algorithm in the presence of some typical (though 
simulated here) deviations of the noise from the simple 
Gaussian and stationary behaviour. As we will see, the 
algorithm is less sensitive to these noise artefacts than 
the true signal, though the Bayes factor decreases as it 
should. This result indeed supports the usefulness of 
evidence-based algorithms as part of the tools for follow- 
up analyses in searches for GWs in the data of current 
interferometers. 



D. Stationary Gaussian noise with ringdown 
injected 

One type of artefact which is often encountered in real 
data is the instrumental ringdown, where some compo- 
nent of the detector is excited and produces damped 
oscillations in the gravitational wave channel, gradually 



decaying. We have modelled the resulting signal for the 
purposes of adding it to the synthetic data set as a generic 
decaying sinusoid waveform in the frequency domain, 

R{f;ARjR,to,T)= J (16) 

The parameters of this signal are amplitude An, fre- 
quency ffi, starting time to, a-nd decay time r after which 
the envelope amplitude falls to of its original value. 
Using this waveform is equivalent to setting the initial 
phase of the wave to be 0, but this does not affect the 
generality of the results. In the simulations presented 
here, a central frequency fn = 150 Hz was chosen, so 
that the peak of the ringdown would lie in the sensitive 
region of the frequency domain range. Ringdowns with 
T = 1 s and 10 s were both added to the dataset, and the 
algorithm was run as before to search for the presence of 
a Newtonian inspiral. We stress again that the instru- 
mental ringdown was not part of the models considered 
for the computation of the Bayes factor. 

The results of this test are reported in Figure [U the 
signal-to-noise ratio of the instrumental ringdown shown 
on the X-axis is calculated in the same way as for the GW 
inspiral signal, using Equation 1161 but replacing h with 
R. The figure shows the response of the Bayes factor _Bsn 
of the inspiral model vs. noise model (for both of which 
the noise component is modelled as exactly Gaussian and 
stationary) to a dataset containing Gaussian noise and an 
instrumental ringdown signal, which is much more sub- 
dued than that to an injected inspiral signal. The vertical 
axis of the plot of Figure [1] is therefore the same quan- 
tity represented on the corresponding axis of Figure [S] A 
ringdown SNR of 1000 is necessary to produce a Bayes 
factor comparable to that for an inspiral SNR of 4.5 (c.f. 
figure [21), when a ringdown with decay constant t =10 s 
is used (for r =ls the effect is even smaller, and hardly 
noticeable on the scale of the plot). 



E. Stationary and Gaussian noise with additional 
Poissonian component 

We now explore a different case in which the model's 
assumption that the noise follows a Gaussian probabil- 
ity distribution is not fully accurate, by injecting an ad- 
ditional Poissonian noise component into the simulated 
data. To accomplish this, a time-series stretch of noise 
with amplitude drawn from a Poisson distribution in the 
time domain, scaled by a factor 100 to increase its ampli- 
tude, and uniform random phase in the interval [0, 2tt) is 
generated and Fourier transformed, before being added 
to the standard Gaussian stationary noise generated in 
the frequency domain. The root-mean-square of the 
Gaussian and Poisson components are 4.6 and 0.3 re- 
spectively. The spectrum of the Poisson noise showed an 
approximately uniform power across frequency in con- 
trast with the shaped power spectrum of the Gaussian 
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FIG. 4: The Bayes factors Bsn for inspiral model vs. noise 
model in the presence of an injected ringdown to simulate such 
events in instrumental noise. Shown here are the results from 
two ringdowns, with decay times r = 1 s (dashed line) and 
10 s (solid line) , and varying signal to noise ratios plotted on 
the X-axis. From these results it is clear that the algorithm 
is robust against interference from this source, as only the 
T = 10 s caused an increase in the Bayes factor, and then 
only at very high SNR. 
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FIG. 6: The results from injecting a Poissonian component 
with distribution P(O.l) into a segment of Gaussian noise (and 
no GW signal) and running the algorithm 100 times. The 
presence of Poisson noise does not increase the Bayes factor 
for an inspiral signal, as the Poissonian noise does not match 
the template for this waveform. The results of plot should be 
compared with those obtained on pure Gaussian and station- 
ary noise reported in Figure [5] 
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FIG. 5: The amplitude spectra of the Gaussian (upper black) 
and Poisson (lower grey) contributions to the data used in 
section IIIIEI 

noise. The amplitude spectrum of the two noise contri- 
butions are shows in figure [51 

The intention of this procedure is to simulate the ef- 
fects of outliers from the Gaussian noise distribution, 
which we know occur in the real interferometer data, and 
which we will be very unlikely to be able to model, mak- 
ing them in effect random. In this test we have chosen a 
Poisson distribution P{\) with a mean A of one point for 
every ten time stamps, i.e. A = 0.1. We have explored 
both the signal-free case and the situation in which a 
GW signal is added to the (Gaussian + Poissonian) noise, 
analogous of the studies presented in Secs. lIIIBl and lTTrCl 



respectively. 

Figure [6] shows the results of analysing simply the com- 
bination of Gaussian and Poisson noise, without a GW 
signal present. We can see that the change in Bayes fac- 
tor due to this additional noise component is minimal 
(cf. Figure [2]), and does not trigger the signal model 
at all, conversely it depresses the odds. The estimated 
probability of a signal being present remains low. 

Figure [7] shows the results of tests similar to those in 
Section Fill CI (GW signal and noise). In comparison to 
Figure O the recovered Bayes factors are reduced for the 
same signal-to-noise ratio - as an example, logj^Q i3sN ~ 
10 is reached for SNR w 4.6 instead of « 3.6 as in the 
pure Gaussian and stationary noise case - although the 
value of SNR for the Poisson dataset does not include 
the Poisson noise component in its noise estimate, as it 
is supposed to be a deviation from the noise model. 

The results from this test indicate that the presence of 
an unmodelled yet stochastic noise source will hinder the 
detection of an inspiral signal, but that the signal model 
would still be selected, although at a somewhat higher 
SNR. Such noise will decrease the probability of detec- 
tion, but it will also decrease the probability of a false 
alarm with this algorithm as the Bayes factor is always 
decreased. The exact quantitative consequences for the 
change of SNR level at which the GW signal model be- 
comes preferred over the noise-only model will depend on 
the actual character of the deviation from the Gaussian 
and stationary stochastic process. This can however be 
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FIG. 7: The results from adding GW signals of varying signal- 
to-noise ratio onto the combined noise data from a Poissonian 
and Gaussian distribution, as described in the text. The re- 
sulting Bayes factors are lowered in comparison to Figure [3] 

rigourously quantified by performing Monte-Carlo simu- 
lations on real data. 

Since we do not expect the algorithm to perform better 
when assumptions of the models are violated, it is desir- 
able that it would degrade in the least harmful manner. 
The results of the tests presented here are therefore reas- 
suring: the Bayes factor will not spuriously generate false 
alarms in the presence of this type of noise, but instead 
will simply not perform as well at detecting actual sig- 
nals. It is worth mentioning again that ideally one would 
also compute the evidence for this noise model, and that 
if this was done some of the sensitivity would be restored. 

IV. CONCLUSIONS 

Bayesian inference using evidence and odds ratios eval- 
uations provides a clear and justifiable means of deter- 
mining the probability of a signal being present in the 
data: it is the optimal inference and, as we have shown 
(although only in simple cases), is also robust against 
interference from non-gravitational wave signals present 
in the data, and against deviations in the noise profile. 
Through the method used in this paper the calculation of 
odds ratios between hypotheses is made feasible within 
useful time-scales. The run-time of the code that we have 



developed can vary depending on factors such as the de- 
sired accuracy, whether or not a signal is present, the 
actual GW waveform and relevant number of parameters 
that describe the model; however, to perform a single run 
of one of the analyses typical in this paper takes (much) 
less than one day on a single 2 GHz CPU. This is signif- 
icantly more efficient than other approaches considered 
so far for Bayesian model selection, e.g. [H, i40j. This 
speed makes it possible to perform thousands of odds ra- 
tios calculations per day on Beowulf- type clusters: this 
gives the method a very good combination of sensitivity 
and speed which we hope will allow the method to be 
used as one of the techniques for follow-up studies in the 
analysis of real data. 

It is our intention now to further develop this method 
towards the evaluation of odds ratios in real interferom- 
eter data, and integrate it with the existing analyses to 
provide a robust Bayesian follow-up capable of determin- 
ing the odds of a signal being present. In order to achieve 
this, we need primarily to include additional models of 
GW signals (such as post-Newtonian waveforms) and 
possibly of instrumental artefacts, which will allow the 
analysis to distinguish between different types of sources 
and to eliminate or detect contamination of the noise 
from unwanted sources. 

Another important feature of the method introduced 
in this paper which has not been discussed here but may 
be useful in a combined Bayesian analysis is its ability 
to find the maximum likelihood values of the parame- 
ters, which would integrate well with a Markov-Chain 
Monte-Carlo approach |24| to full parameter estimation, 
see e.g. [ssl. [S^. [ssl. and references therein. We intend 
to include such an interface with MCMC estimation to 
provide a combined Bayesian follow-up package for inspi- 
ral analysis. 
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